
docs: Skills' parallel code-execution path posture (Avenue A v2 deferral closure) #137

Merged
Number531 merged 1 commit into main from feature/codeexec-skills-investigation-outcome-stacked on May 16, 2026
@Number531

Avenue A v2 follow-up PR 2 of 2 — closes the third defer from PR #135. Documentation-only; no code change. Verdict: Skills' parallel code-execution path is deliberately NOT given Avenue A v2 enforcement.

Stacked on PR #136 (telemetry + docs). When #136 merges to main, this PR rebases cleanly — §12 is purely additive to the end of anthropic-sdk-best-practices-research.md.

TL;DR — DOCUMENT-NO-CHANGE

Explore-agent investigation of src/server/legacyStreamHandler.js:78 (Skills' native code_execution_20250825 registration) traced the entire failure-mode surface and produced a decisive no-code-change verdict:

  • Skills HAS NO envelope contract (output is free-form text + tool-result pass-through)
  • Skills HAS NO retry logic (no equivalent to bridge's MAX_TURNS=3 loop)
  • Skills HAS NO downstream parsed_output consumer (legacyStreamHandler.js only streams text/tool_use_result)
  • Adding output_config would CONFLICT with existing outputFormatSchema logic at the same lines (duplicate format key)
  • Skills is on a deprecation track (4 independent signals; see below)

Why this matters

PR #135 (Avenue A v2) shipped with three explicit defers. PR #136 closed two (telemetry + doc corrections). This PR closes the third — but with documentation rather than code, because the investigation revealed that adding Avenue A v2-style enforcement to Skills would actively harm it.

Four signals confirming Skills' deprecation

| Signal | Evidence |
| --- | --- |
| Default off | SKILLS_ENABLED=false in flags.env:62 |
| Route gating | USE_AGENT_SDK=true (default) → handleAgentStream invoked, NOT handleLegacyStream. Skills path never invoked in production even if its flag were true. |
| Architectural pivot | Commit 12000487 "Step 3 — absorb Anthropic xlsx Skill content into bridge prompt": Skills' use case explicitly moved INTO the bridge |
| Code comment | featureFlags.js:14-17: "Custom skills disabled: Non-SDK-compliant format with no-op code. Subagents handle all domain expertise." |
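
The route-gating signal can be illustrated with a small sketch. The flag names and defaults come from the table above; `selectStreamHandler` and its return shape are hypothetical and are not claimed to match the repository's actual router.

```javascript
// Hypothetical sketch: only the flag names and their defaults come from the PR text.
function selectStreamHandler(env) {
  // USE_AGENT_SDK defaults to true; SKILLS_ENABLED defaults to false.
  const useAgentSdk = (env.USE_AGENT_SDK ?? 'true') === 'true';
  const skillsEnabled = (env.SKILLS_ENABLED ?? 'false') === 'true';

  if (useAgentSdk) {
    // The legacy handler, and with it the Skills registration, is never reached.
    return { handler: 'handleAgentStream', skillsReachable: false };
  }
  return { handler: 'handleLegacyStream', skillsReachable: skillsEnabled };
}

console.log(selectStreamHandler({}));
console.log(selectStreamHandler({ SKILLS_ENABLED: 'true' }));
console.log(selectStreamHandler({ USE_AGENT_SDK: 'false', SKILLS_ENABLED: 'true' }));
```

Under these assumptions, flipping SKILLS_ENABLED alone changes nothing; both flags would have to change before the Skills path could run.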

Why output_config would harm Skills (not help)

  1. JS object-literal duplicate-key collision with existing outputFormatSchema at legacyStreamHandler.js:112-115. JavaScript silently uses the LAST format key — user-requested structured output and Skills enforcement cannot coexist.

  2. No parsed_output consumer: legacyStreamHandler.js only inspects text/tool_use_result blocks. Bridge's extractResults reads parsed_output; Skills' handler doesn't. The API would generate + bill structured output, but the codebase would ignore it.

  3. Truncation cliff worse than xlsx's: Skills outputs (PDF extracts, OCR) are often LARGE. Skills has NO secondary fallback channel like xlsx's stdout. Forced text-channel JSON enforcement could hit max_tokens mid-output — re-introducing the L4 v1 failure mode with NO architectural mitigation.
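
The duplicate-key collision in point 1 can be reproduced in isolation. This is a minimal sketch: the two schema objects and `buildOutputConfig` are hypothetical stand-ins, and only the conditional-spread pattern mirrors what the investigation describes at legacyStreamHandler.js:112-115.

```javascript
// Hypothetical schemas; names are illustrative only.
const userSchema = { type: 'json_schema', name: 'user_requested' };
const skillsSchema = { type: 'json_schema', name: 'skills_envelope' };

function buildOutputConfig({ outputFormatSchema, skillsFormatSchema }) {
  return {
    effort: 'high',
    ...(outputFormatSchema ? { format: outputFormatSchema } : {}),
    // A second conditional spread that targets the same key: the later one wins.
    ...(skillsFormatSchema ? { format: skillsFormatSchema } : {}),
  };
}

const config = buildOutputConfig({
  outputFormatSchema: userSchema,
  skillsFormatSchema: skillsSchema,
});
console.log(config.format.name); // prints "skills_envelope"
```

No error or warning is raised; the user-requested schema is simply replaced, which is why the two enforcement sources cannot share this object.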

Bridge vs Skills — canonical reference table

The §12 addition includes this table for future reference:

| Aspect | Bridge | Skills |
| --- | --- | --- |
| Envelope contract | ENVELOPE_SCHEMA_XLSX / ENVELOPE_SCHEMA_GENERAL | None |
| Validation | selectEnvelopeWithFallback, isCompleteXlsxB64 | None (pass-through) |
| Retry logic | MAX_TURNS=3 corrective loop | None (terminal failure) |
| Structured-output enforcement | YES (Avenue A v2 via output_config) | N/A (no contract to enforce) |
| Tool type | code_execution_20260120 | code_execution_20250825 |
| Default state in prod | Active (CODE_EXECUTION_BRIDGE=true) | Inactive (SKILLS_ENABLED=false + USE_AGENT_SDK=true) |
| Reads response.parsed_output | YES (Avenue A v2 path) | NO |

For structured, multi-turn analysis with envelope validation → use the bridge. Skills are appropriate only for stateless, single-turn pass-through operations against Anthropic-managed domain expertise (pdf, xlsx, docx).
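
The contrast in the table can be sketched as two control flows. Everything here is illustrative: `isValidEnvelope` stands in for the bridge's real validators, the response and block shapes are assumed, and `runBridgeStyle` / `runSkillsStyle` are hypothetical names. Only MAX_TURNS=3 and the "reads parsed_output vs. streams text/tool_use_result" distinction come from this PR.

```javascript
const MAX_TURNS = 3; // value taken from the PR text

// Hypothetical validator standing in for selectEnvelopeWithFallback / isCompleteXlsxB64.
function isValidEnvelope(parsed) {
  return parsed !== null && typeof parsed === 'object' && typeof parsed.status === 'string';
}

// Bridge-style: read parsed_output, validate it, and retry up to MAX_TURNS times.
async function runBridgeStyle(callModel) {
  for (let turn = 1; turn <= MAX_TURNS; turn++) {
    const response = await callModel(turn);
    if (isValidEnvelope(response.parsed_output)) {
      return { ok: true, turn, envelope: response.parsed_output };
    }
  }
  return { ok: false, turn: MAX_TURNS, envelope: null };
}

// Skills-style: forward text and tool_use_result blocks; parsed_output is never read.
function runSkillsStyle(response) {
  return response.content
    .filter((block) => block.type === 'text' || block.type === 'tool_use_result')
    .map((block) => block.text ?? '')
    .join('');
}

// Fake model: the first turn yields no usable envelope, the second one does.
const fakeModel = async (turn) =>
  turn < 2 ? { parsed_output: null } : { parsed_output: { status: 'ok' } };

runBridgeStyle(fakeModel).then((result) => console.log(result.ok, result.turn)); // prints "true 2"
```

In the pass-through flow there is nothing to validate and nothing to retry, so a structured payload attached to the response would be paid for and then dropped.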

Files changed (2)

| File | Δ | Content |
| --- | --- | --- |
| docs/code-execution-enhancements/anthropic-sdk-best-practices-research.md | +87 LOC | New §12 "Skills' parallel code-execution path: why Avenue A v2 does NOT apply (2026-05-16 investigation)" with bridge-vs-Skills comparison table |
| CHANGELOG.md | +15 LOC | [Unreleased] entry under "Documented" closing follow-up #3 |

Verification

Suite: 202/0/2 (PR #136's baseline maintained; no code change in this PR).

Future direction

If custom skills become SDK-compliant in the future, re-evaluate under Programmatic Tool Calling (PTC) using code_execution_20260120 + allowed_callers. At that point, Avenue A v2-style enforcement could be applied at that distinct surface. Until then: bridge is canonical, Skills is deprecated pass-through.

Rollback

Trivial — git revert <merge-sha> removes the §12 doc additions and the CHANGELOG entry. No code, schema, or behavior implications.

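Sequencing

Merge PR #136 first, then rebase this PR onto main. The rebase is conflict-free because §12 is purely additive to the end of the doc file.
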
🤖 Generated with Claude Code

…ral closure)

Closes the third explicit defer from PR #135 (Avenue A v2). An Explore-agent
investigation traced Skills' code-execution registration path at
src/server/legacyStreamHandler.js:78 and reached a decisive DOCUMENT-NO-CHANGE
verdict: Avenue A v2 deliberately does NOT apply to Skills.

═══════════════════════════════════════════════════════════════════════
RATIONALE
═══════════════════════════════════════════════════════════════════════

The codebase has TWO independent code-execution surfaces against the Anthropic
Messages API:

  Bridge        (src/tools/codeExecutionBridge.js:30)
  ───────────────────────────────────────────────────
    Tool:      code_execution_20260120
    Pattern:   Orchestrator-driven multi-turn with envelope contract
    Validate:  isCompleteXlsxB64, selectEnvelopeWithFallback
    Retry:     MAX_TURNS=3 corrective loop
    Avenue A v2: APPLIED (output_config + json_schema enforcement)

  Skills (native)  (src/server/legacyStreamHandler.js:78)
  ───────────────────────────────────────────────────────
    Tool:      code_execution_20250825
    Pattern:   Single-turn streaming pass-through to frontend
    Validate:  None — free-form text + tool_use_result blocks
    Retry:     None (terminal failure on malformed output)
    Avenue A v2: NOT APPLICABLE

Skills doesn't exhibit the envelope-miss failure mode Avenue A v2 fixed because
Skills HAS NO ENVELOPE CONTRACT to enforce. Skills output is free-form text,
accumulated and streamed to the caller; there's no schema, no validation, and
no retry.

═══════════════════════════════════════════════════════════════════════
WHY output_config ENFORCEMENT WOULD ACTIVELY HARM SKILLS
═══════════════════════════════════════════════════════════════════════

1. CONFLICT with existing outputFormatSchema logic at lines 112-115:
   output_config: { effort: 'high', ...(outputFormatSchema ? {format} : {}) }
   Adding ...(skills ? {format: skillsSchema} : {}) creates a duplicate `format`
   key. JS uses the LAST one, silently overwriting user-requested structured
   output enforcement.

2. NO downstream parsed_output consumer:
   - Bridge reads response.parsed_output (codeExecutionBridge.js:extractResults)
   - legacyStreamHandler streams text + tool_use_result only — never inspects
     parsed_output
   - Structured output would be generated + billed in tokens but ignored

3. TRUNCATION RISK worse than xlsx's:
   - Skills outputs (PDF extracts, document parses, OCR) are often LARGE
   - Skills has NO secondary fallback channel (xlsx had stdout to fall back on)
   - Forced text-channel JSON enforcement could hit max_tokens cliff with NO
     architectural mitigation — re-introduces the L4 v1 failure mode

═══════════════════════════════════════════════════════════════════════
SKILLS IS ARCHITECTURALLY DEAD CODE IN PROD
═══════════════════════════════════════════════════════════════════════

Four independent signals:

  1. SKILLS_ENABLED=false in flags.env:62 (production default)
  2. USE_AGENT_SDK=true in flags.env (production default) — routes /api/stream
     to handleAgentStream, NOT handleLegacyStream. Skills path never invoked
     in production even hypothetically.
  3. Architectural pivot commit 1200048: "Step 3 — absorb Anthropic xlsx
     Skill content into bridge prompt" — deliberately moved xlsx skill
     functionality INTO the bridge, replacing Skills-API approach
  4. Code comment in featureFlags.js:14-17 explicitly states "Custom skills
     disabled: Non-SDK-compliant format with no-op code"

═══════════════════════════════════════════════════════════════════════
DELIVERABLE
═══════════════════════════════════════════════════════════════════════

This PR is documentation-only:

  docs/code-execution-enhancements/anthropic-sdk-best-practices-research.md
    +§12 "Skills' parallel code-execution path: why Avenue A v2 does NOT
    apply (2026-05-16 investigation)" — includes bridge-vs-Skills output-
    contract comparison table for future reference

  CHANGELOG.md
    [Unreleased] entry under "Documented" — closes follow-up #3 with the
    architectural posture rationale

═══════════════════════════════════════════════════════════════════════
FUTURE DIRECTION
═══════════════════════════════════════════════════════════════════════

If custom skills become SDK-compliant in the future, re-evaluate under
Programmatic Tool Calling (PTC) using code_execution_20260120 +
allowed_callers. At that point, Avenue A v2-style enforcement could be
applied at that distinct surface. Until then, Skills remains a deprecated
pass-through; bridge remains the canonical structured-output surface.

═══════════════════════════════════════════════════════════════════════
STACKED ON PR #136
═══════════════════════════════════════════════════════════════════════

This branch is stacked on feature/codeexec-avenue-a-v2-telemetry-docs (PR
#136) so the §12 addition cleanly follows the §11 (empirical constraints)
content added there. When PR #136 merges to main, this PR will need to be
rebased onto main; the rebase is conflict-free since §12 is purely additive
to the end of the doc file.

VERIFICATION:
  Suite: 202/0/2 (PR #136's baseline maintained; no code change in this PR)

ROLLBACK: trivial — `git revert <merge-sha>` removes the doc additions.
No code, schema, or behavior implications.

Plan: /Users/ej/.claude/plans/glittery-toasting-stardust.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>